Introduction
The following is intended as a set of tips for people learning how to use Git and GitHub.
There are many excellent guides to Git and GitHub online, e.g.,
- Intro to GitHub here
- GitHub Training & Guides YouTube channel here
- Git documentation and training here
- Hadley Wickham on Git here
- Jenny Bryan on Git and GitHub with R here
And most relevantly the OpenSAFELY documentation here.
These tips are meant to supplement them.
Session aims
By the end of the session you should
- have a basic understanding of how Git works
- be able to perform common Git operations using GitHub Desktop, including
- clone a repo from GitHub
- make a new branch
- make commits
- push your branch to GitHub
- make a pull request
Tips
Intro to Git
- Git was written to allow developers to work on the source code of the Linux kernel
- With one kernel release they got into a terrible mess
- This provoked Linus Torvalds into action
- For an excellent insight into his thinking watch this talk he gave at Google here
- Git was designed to work with text files
- Git can be intimidating to use, especially at the command line, and its errors (like LaTeX and R errors) can be quite cryptic

- A Git repository is a folder/directory on your computer which has been Git initialised
- Git is commonly referred to as version control software
- Git is better described as a content-addressable filesystem, which means that Git tracks the contents of the files in your repo
Git takes snapshots of your files when you tell it to; these snapshots are called commits
Each commit is identified by a SHA-1 hash computed from the snapshot of your files together with the commit’s metadata (author, date, message, and parent commit)
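As a small illustration of content addressing, git hash-object prints the ID under which Git would store a file’s contents. The sketch below assumes a throwaway directory and three hypothetical files; it only shows that identical contents give identical IDs regardless of the file name.

```shell
# Work in a throwaway directory (hypothetical location, created just for this demo)
demo=$(mktemp -d)
cd "$demo" || exit 1

# Two files with different names but identical contents
printf 'hello\n' > a.txt
printf 'hello\n' > b.txt
# A third file with different contents
printf 'hello world\n' > c.txt

# git hash-object prints the ID Git would store each file's contents under
hash_a=$(git hash-object a.txt)
hash_b=$(git hash-object b.txt)
hash_c=$(git hash-object c.txt)

echo "a.txt $hash_a"
echo "b.txt $hash_b"
echo "c.txt $hash_c"
```

The first two IDs should match and the third should differ, because Git looks at contents rather than file names.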



Git knows the state of your files at every commit
- Can easily restore files to a previous state
For Git the state of your files only changes when their contents change
- If you reopen a file, make no changes, then resave it, Git will show no changes
- If you add an empty folder/directory to your repo Git will detect no changes in your repo
- This differs from OneDrive/SharePoint/Google Drive, which are file synchronisation systems
I recommend not placing your Git repos in a location that is synced by OneDrive or Google Drive (they are very different technologies from Git)
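The two "no change" cases above can be checked at the command line. This is a minimal sketch in a throwaway repo; the file name, folder name and demo identity are my assumptions.

```shell
# Throwaway identity so the commit works on a machine with no Git configuration
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Commit one file
printf 'first draft\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# "Re-save" the file with exactly the same contents
printf 'first draft\n' > notes.txt
# Add an empty folder
mkdir empty_folder

# --porcelain gives a short listing that is empty when Git sees no changes
changes=$(git status --porcelain)
echo "Changes seen by Git: '$changes'"
```

The listing should be empty: the re-saved file has the same contents, and Git does not track empty folders.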
The .git folder
- When you initialise a directory, the .git folder is created
- This contains all of the files Git uses to track the contents of your files
- Here is the .git folder of a repo on my computer (I have selected to View hidden files in Windows Explorer)

- Confusingly GitHub hides the .git folder from view

- Here are its contents - never edit these manually

- Explanation of these is (from here)
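If you want to look inside a .git folder yourself, a freshly initialised throwaway repo is a safe place to do so. The sketch below assumes a temporary directory; the exact listing varies a little between Git versions, but HEAD, objects and refs should always be present.

```shell
# Create a throwaway repository (hypothetical location) and list its .git folder
repo=$(mktemp -d)
git init -q "$repo"

# -A also shows hidden entries; look, but do not edit these by hand
ls -A "$repo/.git"
```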

Common Git commands
- I recommend you use GitHub Desktop instead of these commands
- These commands are what GitHub Desktop is using behind the scenes
- Git is the name of the program; git is the name of the executable available at your command line
git init
git add <filename>
git status
git commit -m "Your commit message"
git commit --amend -m "Your amended commit message"
git push
git pull
git clone
git branch
git checkout
git merge
git fetch
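To show how the first few of these commands fit together, here is a minimal sketch run in a throwaway repo. The file name analysis.R, the commit messages and the demo identity are hypothetical.

```shell
# Throwaway identity so the commits work on a machine with no Git configuration
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo" || exit 1

git init -q                                     # turn the folder into a repository
printf 'x <- 1\n' > analysis.R
git status --short                              # analysis.R is listed as untracked
git add analysis.R                              # stage the file for the next commit
git commit -q -m "Add analsis script"           # commit (note the typo in the message)
git commit -q --amend -m "Add analysis script"  # replace the last commit and fix its message
git log --oneline                               # one commit, with the amended message
```

Amending rewrites the most recent commit rather than adding a second one, so it is best kept for commits you have not yet pushed.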
Installing Git and GitHub Desktop
Installing Git
- Windows
- Download and install from here
- macOS comes with an outdated version of Git
I recommend installing the Homebrew version
First install Homebrew, see instructions here
Then run in your Terminal app
brew upgrade
brew install git
Additionally on a Mac it is helpful to install Xcode command line tools (i.e., avoid installing the whole of Xcode.)
xcode-select --install
- You must reinstall these every time you upgrade the operating system version, e.g., from Big Sur to Monterey
- Once Git is installed its executable (called git) should be available at your command line
Check which version you have with (you want something recent-ish)
git --version
On my Windows machine I have
git version 2.33.1.windows.1
Installing GitHub Desktop
- You could use Git through its command-line syntax; however, I recommend you use a graphical Git client
- For Windows and macOS download and install GitHub Desktop from here
- A Linux version of GitHub Desktop is available from here
Intro to GitHub
GitHub is a Git web server; there are others, e.g., GitLab
Your repositories will be stored on GitHub, and you will clone them to your machine to work on them (or work on them in Gitpod)
Under your user account you see the repos you own
On GitHub OpenSAFELY is an organization
- The repos are owned by the organization so they show up under the organisation here

GitHub PAT for R
- To create a GitHub Personal Access Token (PAT), which allows you more downloads from GitHub per hour, run the following in R
install.packages("usethis")
library(usethis)
create_github_token()
GitHub CLI
- GitHub CLI is a command-line interface for operating GitHub
- Installation instructions are here
- But I don’t recommend using this
Git and GitHub Workflow
Standard GitHub workflow
- (I recommend to only fork a public repo if you intend to send a pull request to it)
- Fork the other person’s repo (this will be known as the upstream repo from your fork)
- This creates a copy of their repo under your account (your fork)
- Clone your fork (the copy under your account) to your machine
- Create a new branch (do not work on master/main)
- Make your changes and commit them
- Push your new branch up to your GitHub (i.e., to your fork)
- Create a pull request (from your new branch) back to the default (master/main) branch of the original repo
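The clone, branch, commit and push steps above can be sketched at the command line. To keep the example runnable offline, a local bare repository (fork.git) stands in for your fork on GitHub; that stand-in, the branch name add-description and the demo identity are all my assumptions. The -b option of git init needs Git 2.28 or newer.

```shell
# Throwaway identity so the commits work on a machine with no Git configuration
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work" || exit 1

# Stand-in for your fork on GitHub: a bare repository holding one commit on main
git init -q -b main seed
printf '# Project\n' > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed fork.git

# Clone your fork to your machine
git clone -q "$work/fork.git" local
cd local || exit 1

# Create a new branch rather than working on main
git checkout -q -b add-description

# Make a change and commit it
printf 'A short description.\n' >> README.md
git add README.md
git commit -q -m "Add project description"

# Push the new branch up to the fork (known locally as origin)
git push -q -u origin add-description

# The pull request itself is then opened on GitHub, not with git
git branch -r
```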
Workflow with an OpenSAFELY GitHub repo
- Skip the forking step from the standard GitHub workflow
- The repo on GitHub is known as origin
- Clone the repo to your local machine
- Click: Code | Open with GitHub Desktop

- Click Clone in the box which appears in GitHub Desktop

- In GitHub Desktop (i.e. locally) make a new branch

- Do some work
- Make some changes (to your project.yaml/study_definition.py/R scripts)
- In GitHub Desktop select relevant changed lines and make small-ish commits with sensible commit messages
- Do not commit changes to many files with a single commit message such as “Edits”!

- Note that in a commit we can see the added lines (green highlight with + prefix) and deleted lines (red highlight with - prefix)

- Push your new branch from your local machine up to GitHub
- Make a pull request from your branch to the default branch
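The + and - prefixes that GitHub Desktop shows for added and deleted lines come from Git’s diff output, which can also be viewed at the command line. This is a minimal sketch; the file settings.txt and its contents are hypothetical.

```shell
# Throwaway identity so the commit works on a machine with no Git configuration
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Commit a two-line file
printf 'population: all\nindex_date: 2020-01-01\n' > settings.txt
git add settings.txt
git commit -q -m "Add settings"

# Change the second line only
printf 'population: all\nindex_date: 2021-01-01\n' > settings.txt

# Show uncommitted changes: removed lines start with -, added lines with +
diff_output=$(git diff --no-color)
echo "$diff_output"
```

The command-line counterpart of selecting individual changed lines in GitHub Desktop is git add -p, which is interactive and so is not shown here.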
Making a pull request
- Let’s start by creating a new branch

- We do some work and make a new commit which adds the new file to the repo

- Next publish the new branch to GitHub

- Now initiate the creation of the PR by either clicking in GitHub Desktop “Create Pull Request”

- or clicking on the button on the repo webpage “Compare & pull request”

- Edit the title box, add some extra text in the comment box, select a reviewer, and then click “Create pull request”

- You can amend/edit pull requests by modifying/adding commits to the branch from which you sent the PR
- See more about pull request reviews here
- Merge PR

- Confirm the merge

- (Optional) Delete the branch the PR came from

- The PR is now finished and we can see the merge commit in the default (main/master) branch
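For comparison, here is roughly what merging a branch with a merge commit looks like locally. This is a sketch, not a description of what GitHub does internally; the branch name new-feature, the file names and the demo identity are hypothetical, and git init -b needs Git 2.28 or newer.

```shell
# Throwaway identity so the commits work on a machine with no Git configuration
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q -b main

printf 'line 1\n' > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Do some work on a new branch
git checkout -q -b new-feature
printf 'new file\n' > new_file.txt
git add new_file.txt
git commit -q -m "Add new file"

# Merge it back into main; --no-ff forces a merge commit even when a fast-forward is possible
git checkout -q main
git merge -q --no-ff -m "Merge branch 'new-feature'" new-feature

# (Optional) delete the branch that was merged
git branch -q -d new-feature

# The graph shows the merge commit joining the two lines of history
git log --oneline --graph
```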
Common errors
Forgetting to pull down the latest changes from GitHub
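A quick way to see whether you are behind is to fetch and compare your branch with origin. The sketch below simulates a collaborator with two local clones of a bare repository standing in for GitHub; the directory names, file contents and demo identity are my assumptions, and git init -b needs Git 2.28 or newer.

```shell
# Throwaway identity so the commits work on a machine with no Git configuration
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work" || exit 1

# Stand-in for the repo on GitHub
git init -q -b main seed
printf 'v1\n' > seed/data.txt
git -C seed add data.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed github.git

# Two clones: yours and a collaborator's
git clone -q "$work/github.git" mine
git clone -q "$work/github.git" theirs

# The collaborator commits and pushes a change
printf 'v2\n' >> theirs/data.txt
git -C theirs commit -q -a -m "Update data"
git -C theirs push -q origin main

# Your clone does not know about the new commit until you fetch
git -C mine fetch -q origin
behind=$(git -C mine rev-list --count HEAD..origin/main)
echo "Commits behind origin/main: $behind"

# Pull to bring your local branch up to date
git -C mine pull -q --ff-only origin main
```

In GitHub Desktop the equivalent is to click Fetch origin and then Pull origin before starting new work.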
Merge conflict
See
- About merge conflicts here
- Resolving a merge conflict here
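To see what a merge conflict looks like in a safe place, the sketch below deliberately creates one in a throwaway repo by changing the same line on two branches. The file config.txt, the branch name change-colour and the demo identity are hypothetical; git init -b needs Git 2.28 or newer.

```shell
# Throwaway identity so the commits work on a machine with no Git configuration
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q -b main

printf 'colour = blue\n' > config.txt
git add config.txt
git commit -q -m "Initial commit"

# Change the line on a branch...
git checkout -q -b change-colour
printf 'colour = red\n' > config.txt
git commit -q -a -m "Use red"

# ...and change the same line differently on main
git checkout -q main
printf 'colour = green\n' > config.txt
git commit -q -a -m "Use green"

# The merge stops because Git cannot choose between the two versions
if git merge -q change-colour > /dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
fi
echo "Merge result: $merge_result"

# The file now contains both versions between conflict markers
conflicted=$(cat config.txt)
echo "$conflicted"

# Resolve by writing the version you want, then stage and commit to finish the merge
printf 'colour = red\n' > config.txt
git add config.txt
git commit -q -m "Merge change-colour, keeping red"
```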
OpenSAFELY repositories
- OpenSAFELY is a system of Python packages which run various Docker containers
- The main GitHub organisation page is here
- All the core code is published in their opensafely-core organisation on GitHub here
- And there is also their opensafely-actions organisation here
- A Docker container is like a virtual machine
- It defines the operating system and programs running within it
- On my Windows 10 machine I can run an Ubuntu Docker container
- Just because an R package is installed in the R installation on your machine does not mean that it is installed in the OpenSAFELY R Docker container
Getting started
- See the OpenSAFELY (OS) page here
- If creating a new repo, create it from the OS template here

- This is already Git initialized
Running jobs (on the dummy data)
- In your OS repo online
- On your own machine: install the following free software
- (If on Windows - Windows Subsystem for Linux version 2)
- Docker Desktop
- Python
- Git
- GitHub Desktop
- VSCode text editor
Additional topics
Writing good commit messages
- Follow the standard recommendations about making commit messages, see
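As an illustration only, one widely used convention is a short subject line, a blank line, and then a body explaining why the change was made. Whether this matches the recommendations the author has in mind is my assumption; the file name, messages and demo identity are hypothetical.

```shell
# Throwaway identity so the commit works on a machine with no Git configuration
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

printf 'x <- 1\n' > analysis.R
git add analysis.R

# The first -m is the subject line; the second -m becomes the body after a blank line
git commit -q -m "Add first analysis script" -m "Defines the starting value used by later steps."

# %s prints the subject and %b prints the body of the latest commit
subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "Subject: $subject"
echo "Body: $body"
```

In GitHub Desktop the Summary box corresponds to the subject and the Description box to the body.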
Files for Git to ignore
- You should not commit all files in the folder on your computer into your repo
- The .gitignore file is a list of files and folders in your repo for Git to ignore
- Common files to ignore are
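A hedged sketch of how a .gitignore file behaves is given below. The entries are my own illustrative choices for an R project, not the author’s list, and the file and folder names are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Hypothetical ignore list: R session files, a macOS folder file, and an output folder
printf '.Rhistory\n.RData\n.DS_Store\noutput/\n' > .gitignore

# Create one file that should be tracked and two that should be ignored
printf 'x <- 1\n' > analysis.R
printf 'old commands\n' > .Rhistory
mkdir output
printf 'a,b\n' > output/table.csv

# Ignored files do not appear among the untracked files
listing=$(git status --porcelain)
echo "$listing"

# git check-ignore reports which of the given paths match a .gitignore rule
git check-ignore -v .Rhistory output/table.csv
```

Only .gitignore and analysis.R should be listed as untracked; the .gitignore file itself is normally committed so that collaborators share the same rules.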
GitHub repos contain more than just code
- A repo for an R package will probably contain
- The code for the R package
- The code for its website (often made with pkgdown and hosted with GitHub Pages or Netlify)

- Scripts for controlling continuous integration services such as GitHub Actions